library(topGO)
library(knitr)   # to use kable()
library(limma)   # to use vennDiagram()
library(ggplot2)
TAG <- params$TAG
VAR <- params$VAR
ANNOTATION <- list(genes = '../2019-07-26/genes/annotation.txt',
                   isoforms = '../2019-07-26/transcripts/annotation.txt')
TAGDIR <- paste0('../2020-01-08/', TAG, '/')

Introduction

This is the enrichment analysis for isoforms regulated by hatching. Because I quantified the association of expression with hatching in two different ways, this document has two sections. Section Variance reports the enrichment analysis performed with isoforms ordered by the proportion of expression variance explained by hatching. Note that some isoforms with low variance may still be highly associated with hatching, even if the fold change between levels of this factor is low. Section Differential expression uses an ordering of isoforms based on the significance of the differential expression between levels of hatching, which does depend on fold change.

Reading the data

Functional annotation is in 2019-07-26. I will also upload two lists of isoforms, with either proportion of variance explained by hatching or p-value of differential expression test.

tagVariance <- read.table(paste0(TAGDIR, VAR, '_variance.txt'))
tagPValue   <- read.table(paste0(TAGDIR, VAR, '_pvalue.txt'))
annotation  <- read.table(ANNOTATION[[TAG]], col.names = c('tagname', 'goterms'))

To initialize the topGOdata object, I will need the gene list as a named numeric vector, where the names are the isoforms identifiers and the numeric values, either the portion of variance explained by hatching or the p-values of the differential expression test. The structure() function adds attributes to an object.

Variance <- structure(tagVariance[,1], names = row.names(tagVariance))
PValues  <- structure(tagPValue[,1],   names = row.names(tagPValue))
rm(tagVariance, tagPValue)

There are 33037 isoforms scored with a variance portion and a differential expression p-value. It should actually be the exact same isoforms. The annotation data frame has more than one GO term for every tag, separated by |. I need a named list.

head(annotation)
##          tagname
## 1 TCONS_00000002
## 2 TCONS_00000010
## 3 TCONS_00000011
## 4 TCONS_00000016
## 5 TCONS_00000017
## 6 TCONS_00000018
##                                                             goterms
## 1                                  GO:0008417|GO:0016020|GO:0006486
## 2                                  GO:0043130|GO:0005515|GO:0043161
## 3                       GO:0005840|GO:0015935|GO:0003735|GO:0006412
## 4 GO:0006355|GO:0003700|GO:0006357|GO:0043565|GO:0005634|GO:0032502
## 5 GO:0006355|GO:0003700|GO:0006357|GO:0043565|GO:0005634|GO:0032502
## 6 GO:0006355|GO:0003700|GO:0006357|GO:0043565|GO:0005634|GO:0032502
allgenes2GO <- strsplit(as.character(annotation$goterms), "|", fixed = TRUE)
names(allgenes2GO) <- annotation$tagname
rm(annotation)

There are 19735 isoforms with GO annotations. But the differential expression analysis includes many more isoforms. In order to include the not-annotated isoforms in the enrichment analysis, to see if annotated or not annotated isoforms are more or less often differentially expressed, I should attribute a GO term to them. According to [http://geneontology.org/docs/faq/] nowadays we express lack of annotation by annotating to the root nodes, i.e. GO:0008150 biological_process, GO:0003674 molecular_function, and GO:0005575 cellular_component.

for (tag in unique(c(names(PValues), names(Variance)))) {
   if (! tag %in% names(allgenes2GO)) {
      allgenes2GO <- append(allgenes2GO,
         structure(list(c("GO:0008150", "GO:0003674", "GO:0005575")), names = tag))
   }
}

Using differential expression p-values

Building the topGO object

Creation of a topGO dataset is documented in section 4 of topGO’s the user manual: https://bioconductor.org/packages/release/bioc/vignettes/topGO/inst/doc/topGO.pdf. I need to use the new command, and fill up the slots. The annot object must be a function that compiles “a list of GO terms such that each element in the list is a character vector containing all the gene identifiers that are mapped to the respective GO term.” There are several options, that you can check using help(annFUN.gene2GO), for example. The annFUN.gene2GO requires a gene2GO argument, which is the list of gene-to-GO terms I made before. The geneSelectionFun object is a function that selects the interesting (most significant) genes. It is required to perform Fisher’s exact test. The nodeSize is used to prune the GO hierarchy from the terms which have less than n annotated genes.

I generate three datasets, to analyse each of the three ontologies.

selection <- function(allScores) {return(allScores < 0.05)}
GOdata.BP <- new('topGOdata',
   description = 'Differential expression among Brachionus plicatilis populations.',
   ontology = 'BP', 
   allGenes = PValues,
   annot = annFUN.gene2GO,
   gene2GO = allgenes2GO,
   geneSelectionFun = selection,
   nodeSize = 5)

GOdata.MF <- new('topGOdata',
   description = 'Differential expression among Brachionus plicatilis populations.',
   ontology = 'MF',
   allGenes = PValues,
   annot = annFUN.gene2GO,
   gene2GO = allgenes2GO,
   geneSelectionFun = selection,
   nodeSize = 5)

GOdata.CC <- new('topGOdata',
   description = 'Differential expression among Brachionus plicatilis populations.',
   ontology = 'CC',
   allGenes = PValues,
   annot = annFUN.gene2GO,
   gene2GO = allgenes2GO,
   geneSelectionFun = selection,
   nodeSize = 5)

DataSummary <- data.frame(ontology = c('BP', 'MF', 'CC'),
   Num_Genes = sapply(list(GOdata.BP, GOdata.MF, GOdata.CC), numGenes),
   Num_GO_terms = sapply(list(GOdata.BP, GOdata.MF, GOdata.CC), function(x) length(usedGO(x))))
kable(DataSummary, caption='Number of feasible genes or transcripts and number of GO terms used in each data set.')
rm(DataSummary)
Number of feasible genes or transcripts and number of GO terms used in each data set.
ontology Num_Genes Num_GO_terms
BP 26235 1582
MF 30580 791
CC 21616 390

Running the tests

There are more than one way to test for enrichment. Something that took me a while to understand is that not only there are different test statistics (Fisher’s exact test, Kolmogorov-Smirnov, and others) but also different algorithms: classic, elim, weight… The algorithms are ways to deal with the dependence structure among GO terms due to topology. Some algorithms are compatible with all statistics implemented in topGO. But the weight and the parentchild algorithms are only applicable to Fisher’s exact test. I am not interested in the classic algorithm, which treats GO nodes as independent, and therefore produces an excess of significant results. I will not use the Fisher’s exact test, because it dependes on an arbitrary threshold of significance on non-adjusted p-values.

BP.elim     <- runTest(GOdata.BP, algorithm = "elim",     statistic = "ks")
BP.weight01 <- runTest(GOdata.BP, algorithm = "weight01", statistic = "ks")
BP.lea      <- runTest(GOdata.BP, algorithm = "lea",      statistic = "ks")
MF.elim     <- runTest(GOdata.MF, algorithm = "elim",     statistic = "ks")
MF.weight01 <- runTest(GOdata.MF, algorithm = "weight01", statistic = "ks")
MF.lea      <- runTest(GOdata.MF, algorithm = "lea",      statistic = "ks")
CC.elim     <- runTest(GOdata.CC, algorithm = "elim",     statistic = "ks")
CC.weight01 <- runTest(GOdata.CC, algorithm = "weight01", statistic = "ks")
CC.lea      <- runTest(GOdata.CC, algorithm = "lea",      statistic = "ks")

ResultsSummary <- data.frame(ontology = rep(c("BP", "MF", "CC"), each = 3),
   algorithm = rep(c("elim", "weight01", "lea"), 3),
   TermsTested = sapply(list(BP.elim, BP.weight01, BP.lea, MF.elim, MF.weight01, MF.lea, CC.elim, CC.weight01, CC.lea), function(X) length(score(X))),
   Significant = sapply(list(BP.elim, BP.weight01, BP.lea, MF.elim, MF.weight01, MF.lea, CC.elim, CC.weight01, CC.lea), function(X) sum(score(X) < 0.01)))

kable(ResultsSummary, caption="Number of non-trivial terms tested and those with a score (not corrected p-value) lower than 0.01.")
Number of non-trivial terms tested and those with a score (not corrected p-value) lower than 0.01.
ontology algorithm TermsTested Significant
BP elim 1582 24
BP weight01 1582 10
BP lea 1582 37
MF elim 791 10
MF weight01 791 13
MF lea 791 13
CC elim 390 2
CC weight01 390 1
CC lea 390 3
rm(ResultsSummary)

Results

The topGO package does not pay much attention to what terms are significant because they are overrepresented and which ones are underrepresented among the most differentially expressed genes. I think it’s worth separating them, to facilitate the biological interpretation. Note that not all terms listed in the tables below are significant. The scores for the three methods (elim, weight01 and lea) are non-corrected p-values.

Biological process

orderedTerms <- names(sort(score(BP.weight01)))
significant.weight01 <- score(BP.weight01)[orderedTerms] <= 0.01
significant.lea      <- score(BP.lea)[orderedTerms] <= 0.01
significant.elim     <- score(BP.elim)[orderedTerms] <= 0.01
sigTerms <- orderedTerms[significant.weight01 & significant.lea & significant.elim]
# sigTerms gets updated, and is already used in the code, but I need a permanent copy for later.
BP.pvalue.sigTerms <- sigTerms

BP.all <- GenTable(GOdata.BP, elim=BP.elim, weight01=BP.weight01, lea=BP.lea,
                   orderBy="weight01", ranksOf="elim", topNodes=sum(significant.elim))

kable(
   BP.all[BP.all$Significant > BP.all$Expected,],
   caption = "Most over-represented terms of the Biological Process ontology.")
Most over-represented terms of the Biological Process ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
1 GO:0001522 pseudouridine synthesis 31 4 1.57 2 0.0014 0.0014 0.00143
2 GO:0046854 phosphatidylinositol phosphorylation 45 4 2.27 4 0.0027 0.0027 0.00272
3 GO:0006400 tRNA modification 58 7 2.93 45 0.0163 0.0029 0.01633
4 GO:0050684 regulation of mRNA processing 15 3 0.76 12 0.0059 0.0038 0.00592
5 GO:0030513 positive regulation of BMP signaling pat… 9 4 0.45 8 0.0052 0.0052 0.00525
6 GO:0060271 cilium assembly 73 4 3.69 7 0.0049 0.0068 0.00038
7 GO:0016579 protein deubiquitination 90 9 4.55 14 0.0070 0.0070 0.00696
8 GO:0048015 phosphatidylinositol-mediated signaling 40 4 2.02 20 0.0081 0.0081 0.00806
9 GO:0070286 axonemal dynein complex assembly 9 1 0.45 21 0.0082 0.0082 0.00823
10 GO:0055072 iron ion homeostasis 21 3 1.06 622 0.4004 0.0094 0.40036
11 GO:0007411 axon guidance 11 3 0.56 27 0.0118 0.0118 0.01182
13 GO:0015937 coenzyme A biosynthetic process 8 1 0.40 41 0.0147 0.0147 0.01467
14 GO:0007275 multicellular organism development 85 7 4.29 108 0.0469 0.0155 0.00208
15 GO:0046434 organophosphate catabolic process 52 3 2.63 893 0.5923 0.0158 0.59232
16 GO:0060828 regulation of canonical Wnt signaling pa… 5 2 0.25 46 0.0168 0.0168 0.01683
18 GO:0006511 ubiquitin-dependent protein catabolic pr… 204 12 10.30 325 0.2086 0.0178 0.20864
20 GO:0048017 inositol lipid-mediated signaling 50 6 2.53 53 0.0188 0.0188 0.00241
21 GO:0006488 dolichol-linked oligosaccharide biosynth… 11 1 0.56 54 0.0189 0.0189 0.01886
22 GO:0006654 phosphatidic acid biosynthetic process 10 2 0.51 55 0.0189 0.0189 0.01887
23 GO:0007040 lysosome organization 8 2 0.40 58 0.0193 0.0193 0.01933
kable(
   BP.all[BP.all$Significant < BP.all$Expected,],
   caption = "Most under-represented terms of the Biological Process ontology.")
Most under-represented terms of the Biological Process ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
12 GO:0006418 tRNA aminoacylation for protein translat… 87 2 4.39 78 0.0333 0.0124 0.03332
17 GO:0050953 sensory perception of light stimulus 11 0 0.56 846 0.5527 0.0172 0.55272
19 GO:0032543 mitochondrial translation 5 0 0.25 52 0.0186 0.0186 0.01859
24 GO:0006030 chitin metabolic process 103 5 5.20 142 0.0681 0.0222 0.06805
vennDiagram(vennCounts(cbind(weight01=significant.weight01,
                             lea=significant.lea,
                             elim=significant.elim)))

kable(data.frame(Term = Term(GOTERM[sigTerms]),
                 Definition = Definition(GOTERM[sigTerms]),
                 PValue=score(BP.weight01)[sigTerms]),
      caption = paste('Biological process terms significantly associated with',
                      VAR, 'according to all 3 algorithms', sep=' '))
Biological process terms significantly associated with hatching according to all 3 algorithms
Term Definition PValue
GO:0001522 pseudouridine synthesis The intramolecular conversion of uridine to pseudouridine within an RNA molecule. This posttranscriptional base modification occurs in tRNA, rRNA, and snRNAs. 0.0014295
GO:0046854 phosphatidylinositol phosphorylation The process of introducing one or more phosphate groups into a phosphatidylinositol, any glycerophosphoinositol having one phosphatidyl group esterified to one of the hydroxy groups of inositol. 0.0027221
GO:0050684 regulation of mRNA processing Any process that modulates the frequency, rate or extent of mRNA processing, those processes involved in the conversion of a primary mRNA transcript into a mature mRNA prior to its translation into polypeptide. 0.0038049
GO:0030513 positive regulation of BMP signaling pathway Any process that activates or increases the frequency, rate or extent of BMP signaling pathway activity. 0.0052479
GO:0060271 cilium assembly The assembly of a cilium, a specialized eukaryotic organelle that consists of a filiform extrusion of the cell surface. Each cilium is bounded by an extrusion of the cytoplasmic membrane, and contains a regular longitudinal array of microtubules, anchored basally in a centriole. 0.0067752
GO:0016579 protein deubiquitination The removal of one or more ubiquitin groups from a protein. 0.0069571
GO:0048015 phosphatidylinositol-mediated signaling A series of molecular signals in which a cell uses a phosphatidylinositol-mediated signaling to convert a signal into a response. Phosphatidylinositols include phosphatidylinositol (PtdIns) and its phosphorylated derivatives. 0.0080632
GO:0070286 axonemal dynein complex assembly The aggregation, arrangement and bonding together of a set of components to form an axonemal dynein complex, a dynein complex found in eukaryotic cilia and flagella, in which the motor domain heads interact with adjacent microtubules to generate a sliding force which is converted to a bending motion. 0.0082319

I think the GO graph is useful to see the relationship among the significant terms. But too large graphs are impossible to read. I don’t know how to split the graph in meaningful subgraphs.

showSigOfNodes(GOdata.BP, score(BP.weight01),
               firstSigNodes = sum(significant.elim),
               wantedNodes = sigTerms)

## $dag
## A graphNEL graph with directed edges
## Number of Nodes = 276 
## Number of Edges = 533 
## 
## $complete.dag
## [1] "A graph with 276 nodes."

This is just a example of the most significant GO term:

showGroupDensity(GOdata.BP, names(score(BP.weight01))[which.min(score(BP.weight01))])

Molecular function

orderedTerms <- names(sort(score(MF.weight01)))
significant.weight01 <- score(MF.weight01)[orderedTerms] <= 0.01
significant.lea      <- score(MF.lea)[orderedTerms] <= 0.01
significant.elim     <- score(MF.elim)[orderedTerms] <= 0.01
sigTerms <- orderedTerms[significant.weight01 & significant.lea & significant.elim]
# sigTerms gets updated, and is already used in the code, but I need a permanent copy for later.
MF.pvalue.sigTerms <- sigTerms

MF.all <- GenTable(GOdata.MF, elim=MF.elim, weight01=MF.weight01, lea=MF.lea,
                   orderBy="weight01", ranksOf="elim", topNodes=sum(significant.elim))

kable(
   MF.all[MF.all$Significant > MF.all$Expected,],
   caption = "Most over-represented terms of the Molecular Function ontology.")
Most over-represented terms of the Molecular Function ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
1 GO:0016747 transferase activity, transferring acyl … 158 9 7.84 149 0.18300 0.00036 0.20455
3 GO:0035091 phosphatidylinositol binding 109 9 5.41 1 0.00025 0.00045 0.00025
5 GO:0005509 calcium ion binding 638 33 31.65 3 0.00156 0.00156 0.00156
7 GO:0009982 pseudouridine synthase activity 27 2 1.34 5 0.00300 0.00300 0.00300
9 GO:0060090 molecular adaptor activity 45 6 2.23 16 0.01539 0.00449 0.01539
10 GO:0005515 protein binding 4270 224 211.82 28 0.02495 0.00528 0.01108
kable(
   MF.all[MF.all$Significant < MF.all$Expected,],
   caption = "Most under-represented terms of the Molecular Function ontology.")
Most under-represented terms of the Molecular Function ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
2 GO:0140101 catalytic activity, acting on a tRNA 139 4 6.90 2 0.00096 0.00041 0.00096
4 GO:0004812 aminoacyl-tRNA ligase activity 90 2 4.46 17 0.01613 0.00098 0.01613
6 GO:0004594 pantothenate kinase activity 5 0 0.25 4 0.00283 0.00283 0.00283
8 GO:0008417 fucosyltransferase activity 42 2 2.08 6 0.00422 0.00422 0.00422
vennDiagram(vennCounts(cbind(weight01=significant.weight01,
                             lea=significant.lea,
                             elim=significant.elim)))

kable(data.frame(Term=Term(GOTERM[sigTerms]),
                 Definition=Definition(GOTERM[sigTerms]),
                 PValue=score(MF.weight01)[sigTerms]),
      caption = paste('Molecular function terms significantly associated with', VAR,
                      'according to all 3 algorithms', sep=' '))
Molecular function terms significantly associated with hatching according to all 3 algorithms
Term Definition PValue
GO:0140101 catalytic activity, acting on a tRNA NA 0.0004062
GO:0035091 phosphatidylinositol binding Interacting selectively and non-covalently with any inositol-containing glycerophospholipid, i.e. phosphatidylinositol (PtdIns) and its phosphorylated derivatives. 0.0004529
GO:0005509 calcium ion binding Interacting selectively and non-covalently with calcium ions (Ca2+). 0.0015580
GO:0004594 pantothenate kinase activity Catalysis of the reaction: ATP + pantothenate = ADP + D-4’-phosphopantothenate. 0.0028317
GO:0009982 pseudouridine synthase activity Catalysis of the reaction: RNA uridine = RNA pseudouridine. Conversion of uridine in an RNA molecule to pseudouridine by rotation of the C1’-N-1 glycosidic bond of uridine in RNA to a C1’-C5. 0.0030029
GO:0008417 fucosyltransferase activity Catalysis of the transfer of a fucosyl group to an acceptor molecule, typically another carbohydrate or a lipid. 0.0042250
GO:0004630 phospholipase D activity Catalysis of the reaction: a phosphatidylcholine + H2O = choline + a phosphatidate. 0.0059711
showSigOfNodes(GOdata.MF, score(MF.weight01),
               firstSigNodes = sum(significant.elim),
               wantedNodes = sigTerms)

## $dag
## A graphNEL graph with directed edges
## Number of Nodes = 38 
## Number of Edges = 41 
## 
## $complete.dag
## [1] "A graph with 38 nodes."
showGroupDensity(GOdata.MF, names(score(MF.weight01))[which.min(score(MF.weight01))])

Cellular component

orderedTerms <- names(sort(score(CC.weight01)))
significant.weight01 <- score(CC.weight01)[orderedTerms] <= 0.01
significant.lea      <- score(CC.lea)[orderedTerms] <= 0.01
significant.elim     <- score(CC.elim)[orderedTerms] <= 0.01
sigTerms <- orderedTerms[significant.weight01 & significant.lea & significant.elim]
# sigTerms gets updated, and is already used in the code, but I need a permanent copy for later.
CC.pvalue.sigTerms <- sigTerms

CC.all <- GenTable(GOdata.CC, elim=CC.elim, weight01=CC.weight01, lea=CC.lea,
                   orderBy="weight01", ranksOf="elim", topNodes=sum(significant.elim))
kable(
   CC.all[CC.all$Significant > CC.all$Expected,],
   caption = "Most over-represented terms of the Cellular Component ontology.")
Most over-represented terms of the Cellular Component ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
GO:1902555 endoribonuclease complex 10 1 0.49 10 0.020 0.004 0.020
GO:0005576 extracellular region 197 13 9.56 15 0.026 0.014 0.026
kable(
   CC.all[CC.all$Significant < CC.all$Expected,],
   caption = "Most under-represented terms of the Cellular Component ontology.")
Most under-represented terms of the Cellular Component ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
vennDiagram(vennCounts(cbind(weight01=significant.weight01,
                             lea=significant.lea,
                             elim=significant.elim)))

kable(data.frame(Term=Term(GOTERM[sigTerms]),
                 Definition=Definition(GOTERM[sigTerms]),
                 PValue=score(CC.weight01)[sigTerms]),
      caption = paste('Cellular component terms significantly associated with', VAR,
                      'according to all 3 algorithms', sep=' '))
Cellular component terms significantly associated with hatching according to all 3 algorithms
Term Definition PValue
showSigOfNodes(GOdata.CC, score(CC.weight01),
               firstSigNodes = sum(significant.elim),
               wantedNodes = sigTerms)

## $dag
## A graphNEL graph with directed edges
## Number of Nodes = 6 
## Number of Edges = 5 
## 
## $complete.dag
## [1] "A graph with 6 nodes."
showGroupDensity(GOdata.CC, names(score(CC.weight01))[which.min(score(CC.weight01))])

Using the portion of variance explained by hatching

Building the topGO object

I need to generate the topGO objects again, using the alternative gene ordering, based on the proportion of expression-level variance explained by hatching. I miss a way to inform the topGOdata object that the score in this case is reversed, with respect to p-values: the higher it is, the more differentially expressed the gene is. To make sure that GO terms are tested in the same way than when using p-values, I will just reverse the proportion of variance explained by hatching to its complement. Taking this into account, the subset of interesting genes (selection() function) must be defined as the lowest 10% scores, which are the 10% genes with largest portion of variance explained by hatching.

selection <- function(allScores) {return(allScores <= quantile(allScores, probs = 0.10))}
GOdataVar.BP <- new('topGOdata',
   description = 'Differential expression among Brachionus plicatilis populations.',
   ontology = 'BP', 
   allGenes = 1.0 - Variance,
   annot = annFUN.gene2GO,
   gene2GO = allgenes2GO,
   geneSelectionFun = selection,
   nodeSize = 5)

GOdataVar.MF <- new('topGOdata',
   description = 'Differential expression among Brachionus plicatilis populations.',
   ontology = 'MF',
   allGenes = 1.0 - Variance,
   annot = annFUN.gene2GO,
   gene2GO = allgenes2GO,
   geneSelectionFun = selection,
   nodeSize = 5)

GOdataVar.CC <- new('topGOdata',
   description = 'Differential expression among Brachionus plicatilis populations.',
   ontology = 'CC',
   allGenes = 1.0 - Variance,
   annot = annFUN.gene2GO,
   gene2GO = allgenes2GO,
   geneSelectionFun = selection,
   nodeSize = 5)

library(knitr)
DataSummary <- data.frame(ontology = c('BP', 'MF', 'CC'),
   Num_Genes = sapply(list(GOdataVar.BP, GOdataVar.MF, GOdataVar.CC), numGenes),
   Num_GO_terms = sapply(list(GOdataVar.BP, GOdataVar.MF, GOdataVar.CC), function(x) length(usedGO(x))))
kable(DataSummary, caption='Number of feasible genes or transcripts and number of GO terms used in each data set.')
rm(DataSummary)
Number of feasible genes or transcripts and number of GO terms used in each data set.
ontology Num_Genes Num_GO_terms
BP 26235 1582
MF 30580 791
CC 21616 390

Running the tests

BPvar.elim     <- runTest(GOdataVar.BP, algorithm = "elim",     statistic = "ks")
BPvar.weight01 <- runTest(GOdataVar.BP, algorithm = "weight01", statistic = "ks")
BPvar.lea      <- runTest(GOdataVar.BP, algorithm = "lea",      statistic = "ks")
MFvar.elim     <- runTest(GOdataVar.MF, algorithm = "elim",     statistic = "ks")
MFvar.weight01 <- runTest(GOdataVar.MF, algorithm = "weight01", statistic = "ks")
MFvar.lea      <- runTest(GOdataVar.MF, algorithm = "lea",      statistic = "ks")
CCvar.elim     <- runTest(GOdataVar.CC, algorithm = "elim",     statistic = "ks")
CCvar.weight01 <- runTest(GOdataVar.CC, algorithm = "weight01", statistic = "ks")
CCvar.lea      <- runTest(GOdataVar.CC, algorithm = "lea",      statistic = "ks")

ResultsSummary <- data.frame(ontology = rep(c("BP", "MF", "CC"), each = 3),
   algorithm = rep(c("elim", "weight01", "lea"), 3),
   TermsTested = sapply(list(BPvar.elim, BPvar.weight01, BPvar.lea, MFvar.elim, MFvar.weight01, MFvar.lea, CCvar.elim, CCvar.weight01, CCvar.lea), function(X) length(score(X))),
   Significant = sapply(list(BPvar.elim, BPvar.weight01, BPvar.lea, MFvar.elim, MFvar.weight01, MFvar.lea, CCvar.elim, CCvar.weight01, CCvar.lea), function(X) sum(score(X) < 0.01)))

kable(ResultsSummary, caption="Number of non-trivial terms tested and those with a score (not corrected p-value) lower than 0.01.")
Number of non-trivial terms tested and those with a score (not corrected p-value) lower than 0.01.
ontology algorithm TermsTested Significant
BP elim 1582 33
BP weight01 1582 18
BP lea 1582 66
MF elim 791 23
MF weight01 791 20
MF lea 791 29
CC elim 390 10
CC weight01 390 3
CC lea 390 18
rm(ResultsSummary)

Results

Biological process

orderedTerms <- names(sort(score(BPvar.weight01)))
significant.weight01 <- score(BPvar.weight01)[orderedTerms] <= 0.01
significant.lea      <- score(BPvar.lea)[orderedTerms] <= 0.01
significant.elim     <- score(BPvar.elim)[orderedTerms] <= 0.01
sigTerms <- orderedTerms[significant.weight01 & significant.lea & significant.elim]
# sigTerms gets updated, and is already used in the code, but I need a permanent copy for later.
BP.variance.sigTerms <- sigTerms

BPvar.all <- GenTable(GOdataVar.BP, elim=BPvar.elim, weight01=BPvar.weight01, lea=BPvar.lea,
                      orderBy="weight01", ranksOf="elim", topNodes=sum(significant.elim))

kable(
   BPvar.all[BPvar.all$Significant > BPvar.all$Expected,],
   caption = "Most over-represented terms of the Biological Process ontology.")
Most over-represented terms of the Biological Process ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
1 GO:0006511 ubiquitin-dependent protein catabolic pr… 204 33 20.21 9 0.00217 2.2e-05 0.00217
2 GO:0030513 positive regulation of BMP signaling pat… 9 3 0.89 1 3.6e-05 3.6e-05 3.6e-05
3 GO:0016579 protein deubiquitination 90 19 8.92 2 8.2e-05 8.2e-05 8.2e-05
5 GO:0007411 axon guidance 11 3 1.09 5 0.00074 0.00074 0.00074
6 GO:0007264 small GTPase mediated signal transductio… 228 31 22.59 10 0.00297 0.00123 0.00297
7 GO:0046434 organophosphate catabolic process 52 6 5.15 362 0.15812 0.00147 0.15812
8 GO:0032543 mitochondrial translation 5 1 0.50 8 0.00192 0.00192 0.00192
11 GO:0001522 pseudouridine synthesis 31 9 3.07 13 0.00328 0.00328 0.00328
12 GO:0006400 tRNA modification 58 12 5.75 59 0.01725 0.00335 0.01725
13 GO:0006997 nucleus organization 5 3 0.50 14 0.00342 0.00342 0.00342
14 GO:0035307 positive regulation of protein dephospho… 5 2 0.50 18 0.00516 0.00516 0.00516
16 GO:0046907 intracellular transport 422 60 41.81 139 0.04443 0.00660 0.00088
17 GO:0015937 coenzyme A biosynthetic process 8 3 0.79 24 0.00668 0.00668 0.00668
18 GO:0034474 U2 snRNA 3’-end processing 17 4 1.68 30 0.00915 0.00915 0.00915
20 GO:0006397 mRNA processing 158 35 15.65 12 0.00322 0.01045 0.00322
22 GO:0030163 protein catabolic process 241 35 23.87 7 0.00137 0.01230 5.1e-05
24 GO:0050953 sensory perception of light stimulus 11 2 1.09 824 0.46346 0.01444 0.46346
25 GO:0007029 endoplasmic reticulum organization 13 2 1.29 49 0.01522 0.01522 0.01522
26 GO:0038007 netrin-activated signaling pathway 5 1 0.50 50 0.01527 0.01527 0.01527
27 GO:0034314 Arp2/3 complex-mediated actin nucleation 17 4 1.68 52 0.01539 0.01539 0.01539
28 GO:0035195 gene silencing by miRNA 5 1 0.50 53 0.01546 0.01546 0.01546
29 GO:0070286 axonemal dynein complex assembly 9 2 0.89 56 0.01634 0.01634 0.01634
30 GO:0046854 phosphatidylinositol phosphorylation 45 11 4.46 63 0.01918 0.01918 0.01918
31 GO:0006904 vesicle docking involved in exocytosis 22 6 2.18 67 0.01959 0.01959 0.01959
33 GO:0051270 regulation of cellular component movemen… 5 1 0.50 68 0.01975 0.01975 0.01975
kable(
   BPvar.all[BPvar.all$Significant < BPvar.all$Expected,],
   caption = "Most under-represented terms of the Biological Process ontology.")
Most under-represented terms of the Biological Process ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
4 GO:0055072 iron ion homeostasis 21 2 2.08 514 0.24052 0.00011 0.24052
9 GO:0006030 chitin metabolic process 103 10 10.20 21 0.00606 0.00202 0.00606
10 GO:0006486 protein glycosylation 106 10 10.50 6 0.00083 0.00292 0.00083
15 GO:0060271 cilium assembly 73 6 7.23 3 0.00039 0.00532 0.00039
19 GO:0006545 glycine biosynthetic process 7 0 0.69 34 0.01016 0.01016 0.01016
21 GO:0010972 negative regulation of G2/M transition o… 5 0 0.50 37 0.01103 0.01103 0.01103
23 GO:1903047 mitotic cell cycle process 86 8 8.52 631 0.32396 0.01245 0.65910
32 GO:0006418 tRNA aminoacylation for protein translat… 87 7 8.62 60 0.01839 0.01962 0.01839
vennDiagram(vennCounts(cbind(weight01=significant.weight01,
                             lea=significant.lea,
                             elim=significant.elim)))

kable(data.frame(Term=Term(GOTERM[sigTerms]),
                 Definition=Definition(GOTERM[sigTerms]),
                 PValue=score(BPvar.weight01)[sigTerms]),
      caption = paste('Biological process terms significantly associated with', VAR,
                      'according to all 3 algorithms', sep=' '))
Biological process terms significantly associated with hatching according to all 3 algorithms
Term Definition PValue
GO:0006511 ubiquitin-dependent protein catabolic process The chemical reactions and pathways resulting in the breakdown of a protein or peptide by hydrolysis of its peptide bonds, initiated by the covalent attachment of a ubiquitin group, or multiple ubiquitin groups, to the protein. 0.0000222
GO:0030513 positive regulation of BMP signaling pathway Any process that activates or increases the frequency, rate or extent of BMP signaling pathway activity. 0.0000357
GO:0016579 protein deubiquitination The removal of one or more ubiquitin groups from a protein. 0.0000819
GO:0007411 axon guidance The chemotaxis process that directs the migration of an axon growth cone to a specific target site in response to a combination of attractive and repulsive cues. 0.0007397
GO:0007264 small GTPase mediated signal transduction Any series of molecular signals in which a small monomeric GTPase relays one or more of the signals. 0.0012329
GO:0032543 mitochondrial translation The chemical reactions and pathways resulting in the formation of a protein in a mitochondrion. This is a ribosome-mediated process in which the information in messenger RNA (mRNA) is used to specify the sequence of amino acids in the protein; the mitochondrion has its own ribosomes and transfer RNAs, and uses a genetic code that differs from the nuclear code. 0.0019207
GO:0006030 chitin metabolic process The chemical reactions and pathways involving chitin, a linear polysaccharide consisting of beta-(1->4)-linked N-acetyl-D-glucosamine residues. 0.0020236
GO:0006486 protein glycosylation A protein modification process that results in the addition of a carbohydrate or carbohydrate derivative unit to a protein amino acid, e.g. the addition of glycan chains to proteins. 0.0029192
GO:0001522 pseudouridine synthesis The intramolecular conversion of uridine to pseudouridine within an RNA molecule. This posttranscriptional base modification occurs in tRNA, rRNA, and snRNAs. 0.0032848
GO:0006997 nucleus organization A process that is carried out at the cellular level which results in the assembly, arrangement of constituent parts, or disassembly of the nucleus. 0.0034201
GO:0035307 positive regulation of protein dephosphorylation Any process that activates or increases the frequency, rate or extent of removal of phosphate groups from a protein. 0.0051619
GO:0060271 cilium assembly The assembly of a cilium, a specialized eukaryotic organelle that consists of a filiform extrusion of the cell surface. Each cilium is bounded by an extrusion of the cytoplasmic membrane, and contains a regular longitudinal array of microtubules, anchored basally in a centriole. 0.0053173
GO:0015937 coenzyme A biosynthetic process The chemical reactions and pathways resulting in the formation of coenzyme A, 3’-phosphoadenosine-(5’)diphospho(4’)pantatheine, an acyl carrier in many acylation and acyl-transfer reactions in which the intermediate is a thiol ester. 0.0066754
GO:0034474 U2 snRNA 3’-end processing Any process involved in forming the mature 3’ end of a U2 snRNA molecule. 0.0091518
showSigOfNodes(GOdataVar.BP, score(BPvar.weight01),
               firstSigNodes = sum(significant.elim),
               wantedNodes = sigTerms)

## $dag
## A graphNEL graph with directed edges
## Number of Nodes = 381 
## Number of Edges = 772 
## 
## $complete.dag
## [1] "A graph with 381 nodes."

Below I plot variance portion, but for the termo found most significant when using p-values, for comparison.

showGroupDensity(GOdataVar.BP, names(score(BP.weight01))[which.min(score(BP.weight01))])

Molecular function

orderedTerms <- names(sort(score(MFvar.weight01)))
significant.weight01 <- score(MFvar.weight01)[orderedTerms] <= 0.01
significant.lea      <- score(MFvar.lea)[orderedTerms] <= 0.01
significant.elim     <- score(MFvar.elim)[orderedTerms] <= 0.01
sigTerms <- orderedTerms[significant.weight01 & significant.lea & significant.elim]
# sigTerms gets updated, and is already used in the code, but I need a permanent copy for later.
MF.variance.sigTerms <- sigTerms

MFvar.all <- GenTable(GOdataVar.MF, elim=MFvar.elim, weight01=MFvar.weight01, lea=MFvar.lea,
                      orderBy="weight01", ranksOf="elim", topNodes=sum(significant.elim))
kable(
   MFvar.all[MFvar.all$Significant > MFvar.all$Expected,],
   caption = "Most over-represented terms of the Molecular Function ontology.")
Most over-represented terms of the Molecular Function ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
1 GO:0005515 protein binding 4270 487 418.76 4 0.00056 5.9e-05 0.00056
2 GO:0036459 thiol-dependent ubiquitinyl hydrolase ac… 103 23 10.10 1 3e-05 6.7e-05 3e-05
4 GO:0035091 phosphatidylinositol binding 109 23 10.69 2 0.00033 0.00055 0.00033
6 GO:0004298 threonine-type endopeptidase activity 31 4 3.04 6 0.00087 0.00087 0.00087
7 GO:0140101 catalytic activity, acting on a tRNA 139 16 13.63 10 0.00161 0.00118 0.00161
9 GO:0004594 pantothenate kinase activity 5 2 0.49 9 0.00142 0.00142 0.00142
10 GO:0003676 nucleic acid binding 2125 239 208.40 3 0.00052 0.00238 0.00052
11 GO:0016417 S-acyltransferase activity 6 4 0.59 11 0.00247 0.00247 0.00247
12 GO:0016742 hydroxymethyl-, formyl- and related tran… 9 3 0.88 13 0.00276 0.00276 0.00276
13 GO:0016887 ATPase activity 327 43 32.07 212 0.24838 0.00715 0.24838
15 GO:0060090 molecular adaptor activity 45 13 4.41 7 0.00121 0.00840 0.00121
16 GO:0005220 inositol 1,4,5-trisphosphate-sensitive c… 15 2 1.47 18 0.00879 0.00879 0.00879
17 GO:0070679 inositol 1,4,5 trisphosphate binding 15 2 1.47 19 0.00879 0.00879 0.00879
18 GO:0005516 calmodulin binding 52 10 5.10 20 0.00948 0.00948 0.00948
19 GO:0003993 acid phosphatase activity 10 1 0.98 22 0.00974 0.00974 0.00974
20 GO:0004312 fatty acid synthase activity 6 3 0.59 23 0.00998 0.00998 0.00998
21 GO:0046527 glucosyltransferase activity 17 4 1.67 16 0.00693 0.01026 0.00693
22 GO:0005102 signaling receptor binding 47 7 4.61 112 0.13378 0.01048 0.13378
23 GO:0016748 succinyltransferase activity 5 3 0.49 24 0.01065 0.01065 0.01065
kable(
   MFvar.all[MFvar.all$Significant < MFvar.all$Expected,],
   caption = "Most under-represented terms of the Molecular Function ontology.")
Most under-represented terms of the Molecular Function ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
3 GO:0038023 signaling receptor activity 471 34 46.19 325 0.41565 0.00025 0.48567
5 GO:0008061 chitin binding 112 10 10.98 5 0.00061 0.00061 0.00061
8 GO:0008417 fucosyltransferase activity 42 4 4.12 8 0.00136 0.00136 0.00136
14 GO:0004812 aminoacyl-tRNA ligase activity 90 7 8.83 42 0.03055 0.00828 0.03055
vennDiagram(vennCounts(cbind(weight01=significant.weight01,
                             lea=significant.lea,
                             elim=significant.elim)))

kable(data.frame(Term=Term(GOTERM[sigTerms]),
                 Definition=Definition(GOTERM[sigTerms]),
                 PValue=score(MFvar.weight01)[sigTerms]),
      caption = paste('Molecular functions terms significantly associated with', VAR,
                      'according to all 3 algorithms', sep=' '))
Molecular functions terms significantly associated with hatching according to all 3 algorithms
Term Definition PValue
GO:0005515 protein binding Interacting selectively and non-covalently with any protein or protein complex (a complex of two or more proteins that may include other nonprotein molecules). 0.0000586
GO:0036459 thiol-dependent ubiquitinyl hydrolase activity Catalysis of the thiol-dependent hydrolysis of an ester, thioester, amide, peptide or isopeptide bond formed by the C-terminal glycine of ubiquitin. 0.0000670
GO:0035091 phosphatidylinositol binding Interacting selectively and non-covalently with any inositol-containing glycerophospholipid, i.e. phosphatidylinositol (PtdIns) and its phosphorylated derivatives. 0.0005496
GO:0008061 chitin binding Interacting selectively and non-covalently with chitin, a linear polysaccharide consisting of beta-(1->4)-linked N-acetyl-D-glucosamine residues. 0.0006105
GO:0004298 threonine-type endopeptidase activity Catalysis of the hydrolysis of internal peptide bonds in a polypeptide chain by a mechanism in which the hydroxyl group of a threonine residue at the active center acts as a nucleophile. 0.0008677
GO:0140101 catalytic activity, acting on a tRNA NA 0.0011815
GO:0008417 fucosyltransferase activity Catalysis of the transfer of a fucosyl group to an acceptor molecule, typically another carbohydrate or a lipid. 0.0013571
GO:0004594 pantothenate kinase activity Catalysis of the reaction: ATP + pantothenate = ADP + D-4’-phosphopantothenate. 0.0014241
GO:0003676 nucleic acid binding Interacting selectively and non-covalently with any nucleic acid. 0.0023808
GO:0016417 S-acyltransferase activity Catalysis of the transfer of an acyl group to a sulfur atom on the acceptor molecule. 0.0024717
GO:0016742 hydroxymethyl-, formyl- and related transferase activity Catalysis of the transfer of a hydroxymethyl- or formyl group from one compound (donor) to another (acceptor). 0.0027562
GO:0060090 molecular adaptor activity The binding activity of a molecule that brings together two or more molecules through a selective, non-covalent, often stoichiometric interaction, permitting those molecules to function in a coordinated way. 0.0083968
GO:0005220 inositol 1,4,5-trisphosphate-sensitive calcium-release channel activity Enables the transmembrane transfer of a calcium ion by a channel that opens when inositol 1,4,5-trisphosphate (IP3) has been bound by the channel complex or one of its constituent parts. 0.0087864
GO:0070679 inositol 1,4,5 trisphosphate binding Interacting selectively and non-covalently with inositol 1,4,5 trisphosphate. 0.0087864
GO:0005516 calmodulin binding Interacting selectively and non-covalently with calmodulin, a calcium-binding protein with many roles, both in the calcium-bound and calcium-free states. 0.0094780
GO:0003993 acid phosphatase activity Catalysis of the reaction: an orthophosphoric monoester + H2O = an alcohol + phosphate, with an acid pH optimum. 0.0097424
GO:0004312 fatty acid synthase activity Catalysis of the reaction: acetyl-CoA + n malonyl-CoA + 2n NADPH + 2n H+ = long-chain fatty acid + n+1 CoA + n CO2 + 2n NADP+. 0.0099792
showSigOfNodes(GOdataVar.MF, score(MFvar.weight01),
               firstSigNodes = sum(significant.elim),
               wantedNodes = sigTerms)

## $dag
## A graphNEL graph with directed edges
## Number of Nodes = 87 
## Number of Edges = 107 
## 
## $complete.dag
## [1] "A graph with 87 nodes."

I plot variance portion, but for the term found most significant when using p-values, for comparison.

showGroupDensity(GOdataVar.MF, names(score(MF.weight01))[which.min(score(MF.weight01))])

Cellular component

orderedTerms <- names(sort(score(CCvar.weight01)))
significant.weight01 <- score(CCvar.weight01)[orderedTerms] <= 0.01
significant.lea      <- score(CCvar.lea)[orderedTerms] <= 0.01
significant.elim     <- score(CCvar.elim)[orderedTerms] <= 0.01
sigTerms <- orderedTerms[significant.weight01 & significant.lea & significant.elim]
if (length(sigTerms) == 0) {
   sigTerms <- orderedTerms[significant.lea & significant.elim]
}
# sigTerms gets updated, and is already used in the code, but I need a permanent copy for later.
CC.variance.sigTerms <- sigTerms

CCvar.all <- GenTable(GOdataVar.CC, elim=CCvar.elim, weight01=CCvar.weight01, lea=CCvar.lea,
                      orderBy="weight01", ranksOf="elim", topNodes=max(sum(significant.elim), 10))

kable(
   CCvar.all[CCvar.all$Significant > CCvar.all$Expected,],
   caption = "Most over-represented terms of the Cellular Component ontology.")
Most over-represented terms of the Cellular Component ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
2 GO:0005737 cytoplasm 1176 121 110.82 22 0.0242 0.0073 0.00071
3 GO:0045239 tricarboxylic acid cycle enzyme complex 5 3 0.47 10 0.0100 0.0100 0.00996
4 GO:0005815 microtubule organizing center 96 15 9.05 28 0.0286 0.0105 0.15049
5 GO:1902555 endoribonuclease complex 10 3 0.94 46 0.0602 0.0105 0.06022
6 GO:0005761 mitochondrial ribosome 18 4 1.70 3 0.0035 0.0111 0.00347
7 GO:0005576 extracellular region 197 21 18.56 27 0.0282 0.0130 0.02818
8 GO:0030015 CCR4-NOT core complex 15 3 1.41 13 0.0153 0.0153 0.01528
10 GO:0031981 nuclear lumen 280 37 26.39 24 0.0253 0.0162 0.02534
kable(
   CCvar.all[CCvar.all$Significant < CCvar.all$Expected,],
   caption = "Most under-represented terms of the Cellular Component ontology.")
Most under-represented terms of the Cellular Component ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
1 GO:0016272 prefoldin complex 7 0 0.66 6 0.0053 0.0053 0.00531
9 GO:0005802 trans-Golgi network 12 1 1.13 14 0.0158 0.0158 0.01577
vennDiagram(vennCounts(cbind(weight01=significant.weight01,
                             lea=significant.lea,
                             elim=significant.elim)))

kable(data.frame(Term=Term(GOTERM[sigTerms]),
                 Definition=Definition(GOTERM[sigTerms]),
                 PValue=score(CCvar.weight01)[sigTerms]),
      caption = paste('Cellular component terms significantly associated with', VAR,
                      'according to all 3 algorithms', sep=' '))
Cellular component terms significantly associated with hatching according to all 3 algorithms
Term Definition PValue
GO:0016272 prefoldin complex A multisubunit chaperone that is capable of delivering unfolded proteins to cytosolic chaperonin, which it acts as a cofactor for. In humans, the complex is a heterohexamer of two PFD-alpha and four PFD-beta type subunits. In Saccharomyces cerevisiae, it also acts in the nucleus to regulate the rate of elongation by RNA polymerase II via a direct effect on histone dynamics. 0.0053057
GO:0045239 tricarboxylic acid cycle enzyme complex Any of the heteromeric enzymes that act in the TCA cycle. 0.0099588
showSigOfNodes(GOdataVar.CC, score(CCvar.weight01),
               firstSigNodes = sum(significant.weight01),
               wantedNodes = sigTerms)

## $dag
## A graphNEL graph with directed edges
## Number of Nodes = 10 
## Number of Edges = 13 
## 
## $complete.dag
## [1] "A graph with 10 nodes."

For comparison, I plot the distribution of variance portion for the CC term found most significant when using p-values.

showGroupDensity(GOdataVar.CC, names(score(CC.weight01))[which.min(score(CC.weight01))])

Comparison between the two ordering of genes.

Biological process

allTerms <- usedGO(GOdata.BP)
vennDiagram(vennCounts(cbind(PValue   = allTerms %in% BP.pvalue.sigTerms,
                             Variance = allTerms %in% BP.variance.sigTerms)))

ggplot(data = data.frame(PValue   = rank(score(BP.weight01))[allTerms],
                         Variance = rank(score(BPvar.weight01))[allTerms]),
       mapping = aes(x = PValue, y = Variance)) +
   geom_point() + geom_smooth() + xlab('Using gene p-values') +
   ylab('Using portion of explained variance') +
   ggtitle('Ordering of BP terms by significance')

Molecular function

allTerms <- usedGO(GOdata.MF)
vennDiagram(vennCounts(cbind(PValue   = allTerms %in% MF.pvalue.sigTerms,
                             Variance = allTerms %in% MF.variance.sigTerms)))

ggplot(data = data.frame(PValue   = rank(score(MF.weight01))[allTerms],
                         Variance = rank(score(MFvar.weight01))[allTerms]),
       mapping = aes(x = PValue, y = Variance)) +
   geom_point() + geom_smooth() + xlab('Using gene p-values') +
   ylab('Using portion of explained variance') +
   ggtitle('Ordering of MF terms by significance')

Cellular component

allTerms <- usedGO(GOdata.CC)
vennDiagram(vennCounts(cbind(PValue   = allTerms %in% CC.pvalue.sigTerms,
                             Variance = allTerms %in% CC.variance.sigTerms)))

ggplot(data = data.frame(PValue   = rank(score(CC.weight01))[allTerms],
                         Variance = rank(score(CCvar.weight01))[allTerms]),
       mapping = aes(x = PValue, y = Variance)) +
   geom_point() + geom_smooth() + xlab('Using gene p-values') +
   ylab('Using portion of explained variance') +
   ggtitle('Ordering of CC terms by significance')

Session info

I save the main variables to be able to load them in a new R session later.

save(allgenes2GO,
     GOdata.BP, BP.elim, BP.weight01, BP.lea, BP.pvalue.sigTerms,
     GOdata.MF, MF.elim, MF.weight01, MF.lea, MF.pvalue.sigTerms,
     GOdata.CC, CC.elim, CC.weight01, CC.lea, CC.pvalue.sigTerms,
     GOdataVar.BP, BPvar.elim, BPvar.weight01, BPvar.lea, BP.variance.sigTerms,
     GOdataVar.MF, MFvar.elim, MFvar.weight01, MFvar.lea, MF.variance.sigTerms,
     GOdataVar.CC, CCvar.elim, CCvar.weight01, CCvar.lea, CC.variance.sigTerms,
     file = paste('Enrichment', TAG, VAR, 'RData', sep='.'))
sessionInfo()
## R version 3.6.2 (2019-12-12)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.4 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
## 
## locale:
##  [1] LC_CTYPE=ca_ES.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=es_ES.UTF-8        LC_COLLATE=ca_ES.UTF-8    
##  [5] LC_MONETARY=es_ES.UTF-8    LC_MESSAGES=ca_ES.UTF-8   
##  [7] LC_PAPER=es_ES.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
##  [1] grid      stats4    parallel  stats     graphics  grDevices utils    
##  [8] datasets  methods   base     
## 
## other attached packages:
##  [1] Rgraphviz_2.30.0     ggplot2_3.2.1        limma_3.42.0        
##  [4] knitr_1.27           topGO_2.38.1         SparseM_1.78        
##  [7] GO.db_3.10.0         AnnotationDbi_1.48.0 IRanges_2.20.2      
## [10] S4Vectors_0.24.3     Biobase_2.46.0       graph_1.64.0        
## [13] BiocGenerics_0.32.0 
## 
## loaded via a namespace (and not attached):
##  [1] tidyselect_0.2.5   xfun_0.12          purrr_0.3.3        splines_3.6.2     
##  [5] lattice_0.20-38    colorspace_1.4-1   vctrs_0.2.1        htmltools_0.4.0   
##  [9] yaml_2.2.0         mgcv_1.8-31        blob_1.2.1         rlang_0.4.2       
## [13] pillar_1.4.3       glue_1.3.1         withr_2.1.2        DBI_1.1.0         
## [17] bit64_0.9-7        matrixStats_0.55.0 lifecycle_0.1.0    stringr_1.4.0     
## [21] munsell_0.5.0      gtable_0.3.0       memoise_1.1.0      evaluate_0.14     
## [25] labeling_0.3       highr_0.8          Rcpp_1.0.3         backports_1.1.5   
## [29] scales_1.1.0       farver_2.0.3       bit_1.1-15.1       digest_0.6.23     
## [33] stringi_1.4.5      dplyr_0.8.3        tools_3.6.2        magrittr_1.5      
## [37] lazyeval_0.2.2     tibble_2.1.3       RSQLite_2.2.0      crayon_1.3.4      
## [41] pkgconfig_2.0.3    zeallot_0.1.0      Matrix_1.2-18      assertthat_0.2.1  
## [45] rmarkdown_2.1      R6_2.4.1           nlme_3.1-143       compiler_3.6.2